The Filtering Approaches for the Improved Boyer-Moore Approximate String Matching

Author

  • Yi Kung Shieh
Abstract

The Boyer-Moore algorithm solves exact string matching. Here, its Bad Character Rule is extended to approximate string matching. Tarhio and Ukkonen introduced a basic algorithm of this kind, but it is similar to the Horspool algorithm. We use the concept behind their algorithm to implement the Bad Character Rule and obtain a new shift length, so that when the window must be shifted in the filtering stage, there is a chance of a larger shift. This paper also explains two simple filtering approaches, either of which can easily be combined with our algorithm. These filtering rules are easy to understand: one follows directly from the definition of edit distance, and the other uses a special relationship between edit distance and Hamming distance.
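For readers unfamiliar with the two distances named in the abstract, the following is a minimal Python sketch of their textbook definitions. The function names are illustrative, and the two bounds mentioned after the code are standard facts about these distances; the abstract does not spell out the paper's filtering rules, so this should not be read as the author's method.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insert, delete, substitute), computed row by row."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # match or substitute
            ))
        prev = curr
    return prev[-1]
```

Two well-known bounds follow from these definitions: the edit distance is never smaller than the difference in length between the two strings, and for strings of equal length it is never larger than the Hamming distance, because substituting every mismatched position is one valid edit script.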

Similar resources

Approximate Boyer-Moore String Matching

The Boyer-Moore idea applied in exact string matching is generalized to approximate string matching. Two versions of the problem are considered. The k mismatches problem is to find all approximate occurrences of a pattern string (length m) in a text string (length n) with at most k mismatches. Our generalized Boyer-Moore algorithm is shown (under a mild independence assumption) to solve the pro...
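The k mismatches problem stated above can be made concrete with a naive reference implementation. This is only a brute-force sketch in Python (the function name is hypothetical) that checks every alignment; it is not the generalized Boyer-Moore algorithm the abstract describes, which aims to skip alignments.

```python
def k_mismatch_occurrences(text: str, pattern: str, k: int) -> list[int]:
    """Start positions where pattern matches text with at most k mismatches."""
    m, n = len(pattern), len(text)
    hits = []
    for start in range(n - m + 1):
        mismatches = 0
        for j in range(m):
            if text[start + j] != pattern[j]:
                mismatches += 1
                if mismatches > k:
                    break  # too many mismatches at this alignment
        else:
            # Inner loop finished without break: at most k mismatches.
            hits.append(start)
    return hits
```

For example, `k_mismatch_occurrences("abcabd", "abd", 1)` reports positions 0 and 3, since "abc" differs from "abd" in one character. The brute-force check takes O(mn) time in the worst case, which is the cost a Boyer-Moore style approach tries to reduce.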

String Matching in the DNA Alphabet

Searching for occurrences of string patterns is a common problem in many applications. Various good solutions have been presented for string matching. The most efficient solutions in practice are based on the Boyer–Moore algorithm. A typical question in molecular biology is whether a given sequence has appeared elsewhere. In the following, we will concentrate on searching for exact occurrences...

Enhanced Pattern Matching Performance Using Improved Boyer Moore Horspool Algorithm

In computer science, the Boyer–Moore–Horspool algorithm is an algorithm for finding substrings in strings. Pattern matching solutions can be classified as software-based or hardware-based according to how they are implemented. It is important to enhance pattern matching performance. This paper proposes enhancing pattern matching performance using an improved Boyer–Moore–Horspool algorithm. It combines the determinis...
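As background for the algorithm named above, here is a minimal Python sketch of the classic Boyer–Moore–Horspool exact search. It shows the standard textbook version only, not the improved variant this paper proposes, and the function name is illustrative.

```python
def horspool_search(text: str, pattern: str) -> list[int]:
    """Start positions of all exact occurrences of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # For each character in pattern[:-1], the distance from its last
    # occurrence to the end of the pattern. Later indices overwrite
    # earlier ones, so the smallest (safe) shift is kept.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    hits = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Shift by the text character under the window's last position;
        # characters absent from pattern[:-1] allow a full shift of m.
        pos += shift.get(text[pos + m - 1], m)
    return hits
```

The shift depends only on the text character aligned with the last pattern position, which is what distinguishes Horspool's simplification from the full Boyer–Moore rules.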

Approximate String Matching with Reduced Alphabet

We present a method to speed up approximate string matching by mapping the factual alphabet to a smaller alphabet. We apply the alphabet reduction scheme to a tuned version of the approximate Boyer–Moore algorithm utilizing the Four-Russians technique. Our experiments show that the alphabet reduction makes the algorithm faster. Especially in the k-mismatch case, the new variation is faster tha...

String Matching Rules Used by Variants of Boyer-Moore Algorithm

The string matching problem is widely studied in computer science, mainly because of its many applications in various fields. In this regard, many string matching algorithms have been proposed. Boyer-Moore is the most popular of them; hence, most variants are derived from the Boyer-Moore (BM) algorithm. This paper addresses the variant of Boyer-Moore algorithm for finding the occurrences of...

Publication date: 2014